Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-108866: Change optimizer API and contract #109144

Closed

Conversation

markshannon
Copy link
Member

@markshannon markshannon commented Sep 8, 2023

Partial implementation of #108866

  • Changes the return type of the execute function from _PyInterpreterFrame * to _Py_CODEUNIT *, returning the next instruction to execute
  • The execute function always executes at least one instruction.

Since the uop and optimizer and executor start at the top of the loop, starting at the JUMP_BACKWARD means that they always execute one or more instructions.
ENTER_EXECUTOR is more efficient as it doesn't need to worry about edge cases like executing no instructions or handling INSTRUMENTED_LINE.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I think about this, the more I think that the problem with signature of the executor is that it has too many return values:

  • An error indicator
  • The frame where the error (or exit) occurred
  • The instruction pointer where it occurred
  • The stack pointer at that time

Both the frame and the instruction pointer are needed to continue execution and also for error handling, and both can differ from the value before the executor runs. Same for the stack pointer (though we always sync that through the frame object).

Things would be neater if these were all explicit and we didn't have to recover one or the other through tstate->current_frame.

The optimizer inherits this awkwardness because it promises to also run the executor. (And why is that? It seems to make things more complicated, and it means the opcode isn't replaced until after that first run -- won't that confuse other threads?)

The requirement to include the JUMP_BACKWARD instruction in the input to the optimizer causes a bunch of extra work which I'm also not keen on. And, despite what I said in in gh-108866, I actually still don't completely understand why we need this.

In any case maybe we need to split the two changes.

Python/bytecodes.c Outdated Show resolved Hide resolved
Comment on lines 2225 to 2228
while(oparg > 255) {
oparg >>= 8;
src--;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clever. I'd add assert(src->op.code == EXTENDED_ARG).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this assert is correct, since it could be an INSTRUMENTED_LINE or something. Ditto for the matching assert in _PyOptimizer_BackEdge.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch. Can instrumentation really overwrite EXTENDED_ARG? @markshannon

Python/optimizer.c Outdated Show resolved Hide resolved
@markshannon
Copy link
Member Author

The frame pointer, instruction pointer and stack pointer are all part of the VM state.
We either need to store them in memory, or pass them as an argument/return value when calling or returning from a C function (or jitted code).

The only reason we passed these values as arguments or return value was for efficiency. It reduces the number of memory accesses at the point of transfer. If it turns out that most callees don't need them, we can just remove the parameter from the API.

The difference with this PR is that the return value is meaningful, not just a cached value. The next_instr cannot be derived from the prev_instr stored on the frame. If the last instruction was a jump then next_instr != prev_instr + 1.

Things would be neater if these were all explicit and we didn't have to recover one or the other through tstate->current_frame.

tstate->current_frame is the current frame and holds the previous instruction and stack pointers.
Unless we are passing values in registers for performance, we don't want to make copies that could get out of sync.

@gvanrossum
Copy link
Member

Thanks for the changes.

I believe we decided offline to hold off on this until Brandt has had an opportunity to change _PyOptimizer_BackEdge not to call the executor, and to modify JUMP_BACKWARD instead to "deoptimize" to ENTER_EXECUTOR. (Or to find out that there was a reason why things are the way they are.)

@markshannon
Copy link
Member Author

This is now obsolete

@markshannon markshannon deleted the optimizer-return-next-instr branch January 18, 2024 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants